27 research outputs found

    Pruning Neural Networks via Coresets and Convex Geometry: Towards No Assumptions

    Pruning is one of the predominant approaches for compressing deep neural networks (DNNs). Recently, coresets (provable data summarizations) have been leveraged for pruning DNNs, adding the advantage of theoretical guarantees on the trade-off between the compression rate and the approximation error. However, coresets in this domain were either data-dependent or generated under restrictive assumptions on both the model's weights and inputs. In real-world scenarios, such assumptions are rarely satisfied, limiting the applicability of coresets. To this end, we suggest a novel and robust framework for computing such coresets under mild assumptions on the model's weights and without any assumption on the training data. The idea is to compute the importance of each neuron in each layer with respect to the output of the following layer. This is achieved by a combination of the Löwner ellipsoid and Carathéodory's theorem. Our method is simultaneously data-independent, applicable to various networks and datasets (due to the simplified assumptions), and theoretically supported. Experimental results show that our method outperforms existing coreset-based neural pruning approaches across a wide range of networks and datasets. For example, our method achieved a 62% compression rate on ResNet50 on ImageNet with only a 1.09% drop in accuracy.
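    As a rough, hypothetical illustration of the general coreset idea only (plain sensitivity sampling, not the authors' Löwner-ellipsoid/Carathéodory construction), neuron importance can be proxied by the norm of each neuron's outgoing weights in the following layer; the function name, the sensitivity proxy, and the reweighting below are all assumptions for illustration.

```python
import numpy as np

def prune_neurons_by_sensitivity(W_next, keep, rng=None):
    """Illustrative sensitivity-sampling pruning; NOT the paper's
    Löwner-ellipsoid/Carathéodory construction.

    W_next: (d_out, d_in) weight matrix of the *following* layer;
            column j carries neuron j's contribution to its output.
    keep:   number of neurons to retain.
    Returns the kept neuron indices and the reweighted columns.
    """
    rng = rng or np.random.default_rng(0)
    # Sensitivity proxy: how strongly each neuron can move the next layer's output.
    sens = np.linalg.norm(W_next, axis=0)
    probs = sens / sens.sum()
    # Sample neurons proportionally to sensitivity, then reweight the kept
    # columns by inverse sampling probability (the usual coreset correction).
    idx = rng.choice(W_next.shape[1], size=keep, replace=False, p=probs)
    return idx, W_next[:, idx] / (keep * probs[idx])
```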

    Deep Learning on Home Drone: Searching for the Optimal Architecture

    We suggest the first system that runs real-time semantic segmentation via deep learning on a weak micro-computer such as the Raspberry Pi Zero v2 (whose price was $15) attached to a toy-drone. In particular, since the Raspberry Pi weighs less than 16 grams, and its size is half of a credit card, we could easily attach it to the common commercial DJI Tello toy-drone (<$100, <90 grams, 98 × 92.5 × 41 mm). The result is an autonomous drone (no laptop nor human in the loop) that can detect and classify objects in real-time from a video stream of an on-board monocular RGB camera (no GPS or LIDAR sensors). The companion videos demonstrate how this Tello drone scans the lab for people (e.g. for the use of firefighters or security forces) and for an empty parking slot outside the lab. Existing deep learning solutions are either much too slow for real-time computation on such IoT devices, or provide results of impractical quality. Our main challenge was to design a system that takes the best of all worlds among numerous combinations of networks, deep learning platforms/frameworks, compression techniques, and compression ratios. To this end, we provide an efficient search algorithm that aims to find the optimal combination which results in the best trade-off between the network's running time and its accuracy/performance.
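    The paper's search algorithm is more efficient than exhaustive scoring; as a minimal sketch of the underlying idea, one can score every (network, compression ratio) combination with a hypothetical on-device `benchmark` helper (an assumption, not the paper's API) and keep the Pareto-optimal time/accuracy trade-off curve:

```python
import itertools

def pareto_front(results):
    """Keep combinations that no other combination beats on both axes."""
    def dominated(c):
        return any(
            (o["ms"] <= c["ms"] and o["acc"] > c["acc"])
            or (o["ms"] < c["ms"] and o["acc"] >= c["acc"])
            for o in results
        )
    return [c for c in results if not dominated(c)]

def search(networks, ratios, benchmark):
    """Score every (network, compression ratio) pair.

    benchmark(net, ratio) is an assumed helper that returns
    (accuracy, milliseconds per frame) measured on the target device.
    """
    results = []
    for net, ratio in itertools.product(networks, ratios):
        acc, ms = benchmark(net, ratio)
        results.append({"net": net, "ratio": ratio, "acc": acc, "ms": ms})
    return pareto_front(results)
```

    A deployment would then pick the most accurate point on the returned curve that still meets the per-frame latency budget of the target micro-computer.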

    Drive Anywhere: Generalizable End-to-end Autonomous Driving with Multi-modal Foundation Models

    As autonomous driving technology matures, end-to-end methodologies have emerged as a leading strategy, promising seamless integration from perception to control via deep learning. However, existing systems grapple with challenges such as unexpected open-set environments and the complexity of black-box models. At the same time, the evolution of deep learning introduces larger, multimodal foundation models, offering multi-modal visual and textual understanding. In this paper, we harness these multimodal foundation models to enhance the robustness and adaptability of autonomous driving systems, enabling out-of-distribution, end-to-end, multimodal, and more explainable autonomy. Specifically, we present an approach to apply end-to-end open-set (any environment/scene) autonomous driving that is capable of providing driving decisions from representations queryable by image and text. To do so, we introduce a method to extract nuanced spatial (pixel/patch-aligned) features from transformers to enable the encapsulation of both spatial and semantic features. Our approach (i) demonstrates unparalleled results in diverse tests while achieving significantly greater robustness in out-of-distribution situations, and (ii) allows the incorporation of latent space simulation (via text) for improved training (data augmentation via text) and policy debugging. We encourage the reader to check our explainer video at https://www.youtube.com/watch?v=4n-DJf8vXxo&feature=youtu.be and to view the code and demos on our project webpage at https://drive-anywhere.github.io/.
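    As a minimal sketch of a text-queryable spatial representation (assuming per-patch ViT tokens and a matching text embedding have already been produced by CLIP-style encoders; the paper's pixel/patch-aligned feature extraction is its own contribution and is not shown here), each patch can be scored against a text query by cosine similarity:

```python
import torch
import torch.nn.functional as F

def text_query_heatmap(patch_tokens: torch.Tensor,
                       text_embedding: torch.Tensor) -> torch.Tensor:
    """Score each spatial patch against a text query.

    patch_tokens:   (H*W, D) per-patch features from a ViT backbone
                    (assumed extractor, not the paper's method).
    text_embedding: (D,) query embedding from a matching text encoder.
    Returns a (H*W,) cosine-similarity map; swapping the query text is
    one way to emulate the latent-space "simulation" described above.
    """
    patches = F.normalize(patch_tokens, dim=-1)
    text = F.normalize(text_embedding, dim=-1)
    return patches @ text  # cosine similarity per patch
```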

    Adult Ocular Toxocariasis Mimicking Ciliary Body Malignancy

    Purpose. To discuss an unusual presentation of ocular toxocariasis. Methods. Case report. Results. A 40-year-old woman presented with decreased vision in the left eye and a long history of recurrent red eye from uveitis. Eosinophilia and positive ELISA titers for Toxocara canis favored the diagnosis of ocular toxocariasis. Over 3 months, an anterior scleral mass grew rapidly, raising the possibility of medulloepithelioma, which can rarely mimic uveitic syndromes. The surgical plan changed from local excision to enucleation. Histopathology demonstrated a large homogeneous mass of chronic inflammatory cells with inflammation of the overlying thinned-out sclera, the medial rectus insertion, and the limbal cornea. The triad of peripheral granuloma, eosinophilia, and positive blood serology established the diagnosis of ocular toxocariasis. Conclusions. Ocular toxocariasis can mimic ocular malignancies such as medulloepithelioma in adults and rarely presents as an anterior scleral mass.